Discovering Interesting Patterns in Numerical Data with Background Knowledge

Author

  • Szymon Jaroszewicz
Abstract

Association rule mining (Agrawal, Imielinski & Swami, 1993) is one of the most important data mining tasks. Initially, only simple conjunctions of items were allowed as patterns, but the framework has since been generalized to other pattern types such as sequences, trees, and graphs, significantly expanding its applicability. Curiously, relatively little effort has been devoted to generalizing association rules to numerical attributes, despite the practical ubiquity of numerical data. The main approach to mining numerical data has been discretization (Srikant & Agrawal, 1996): numerical attributes are split into a number of discrete intervals, after which the data can be mined using standard techniques. Discretization, however, has several problems. First, discretizing the attributes leads to information loss. Second, each interval contains only a small portion of the data, which can lead to statistical estimation problems. The third problem is that relationships between at...
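The discretization step described in the abstract can be illustrated with a minimal sketch. It assumes equal-width intervals, which is only one of several possible schemes and is not necessarily the one used by Srikant & Agrawal; the function name `equal_width_bins` and the `ages` values are made up for illustration.

```python
from collections import Counter


def equal_width_bins(values, n_bins):
    """Map each value to an interval index in [0, n_bins - 1].

    Hypothetical helper: equal-width binning is an assumption here,
    not the scheme prescribed by the cited work.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values are identical, so a single interval suffices.
        return [0 for _ in values]
    # min(...) folds the maximum value into the last interval.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Made-up numerical attribute, for illustration only.
ages = [23, 25, 31, 38, 44, 47, 52, 59, 61, 70]

bins = equal_width_bins(ages, n_bins=4)

# After discretization each record carries only its interval index,
# so the exact values are lost and each interval holds few records.
counts = Counter(bins)
print(bins)
print(dict(counts))
```

Once every value is replaced by its interval index, the attribute can be treated like an ordinary item by a standard association rule miner. The sketch also makes the first two problems named in the abstract visible: the original values cannot be recovered from the indices, and each interval holds only a few of the ten records.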

Similar resources

Are We Really Discovering “Interesting” Knowledge From Data?

This paper is a critical review of the literature on discovering comprehensible, interesting knowledge (or patterns) from data. The motivation for this review is that the majority of the literature focuses only on the problem of maximizing the accuracy of the discovered patterns, ignoring other important pattern-quality criteria that are user-oriented, such as comprehensibility and interestingn...

A new Approach for Handling Numeric Ranges for Graph-Based Knowledge Discovery

Discovering interesting patterns from structural domains is an important task in many real-world domains. In recent years, graph-based approaches have proven to be a straightforward tool for mining structural data. However, not all graph-based knowledge discovery algorithms deal with numerical attributes in the same way. Some of the algorithms discard the numeric attributes during the prepr...

An efficient Bayesian network approach for discovering interesting patterns

The main problem faced by all association rule/pattern mining algorithms is that they produce a large number of rules, which creates a secondary mining problem: mining interesting association rules/patterns. The problem is compounded by the fact that discovered rules reflecting ‘common knowledge’ are not interesting, yet they are usually strong rules with high support and confidence levels – the...

Mining Of Spatial Co-location Pattern from Spatial Datasets

Spatial data mining, or knowledge discovery in spatial databases, refers to the extraction of implicit knowledge, spatial relations, or other patterns not explicitly stored in spatial databases. Spatial data mining is the process of discovering interesting characteristics and patterns that may implicitly exist in spatial databases. A huge amount of spatial data and newly emerging concept of Spati...

On Knowledgeable Unsupervised Text Mining

Text Mining is about discovering novel, interesting and useful patterns from textual data. In this paper we discuss several means that introduce background knowledge into unsupervised text mining in order to improve the novelty, the interestingness or the usefulness of the detected patterns. Germane to the different proposals is that they strive for higher abstractions that carry more explanato...

Publication date: 2016